|
In information theory and computer science, the Damerau–Levenshtein distance (named after Frederick J. Damerau and Vladimir I. Levenshtein) is a string metric, i.e., a distance between two strings (finite sequences of symbols), given by the minimum number of operations needed to transform one string into the other. An operation is an insertion, deletion, or substitution of a single character, or a transposition of two adjacent characters.

In his seminal paper, Damerau not only distinguished these four edit operations but also stated that they correspond to more than 80% of all human misspellings. Damerau's paper considered only misspellings that could be corrected with at most one edit operation.

The classical Levenshtein distance allows only insertion, deletion, and substitution operations. The Damerau–Levenshtein distance differs from it by also allowing transpositions of adjacent symbols, which produces a different distance measure.

While the original motivation was to measure distance between human misspellings to improve applications such as spell checkers, Damerau–Levenshtein distance has also seen uses in biology to measure the variation between DNA.

== Definition ==

To express the Damerau–Levenshtein distance between two strings $a$ and $b$, a function $d_{a,b}(i,j)$ is defined, whose value is the distance between an $i$-symbol prefix (initial substring) of string $a$ and a $j$-symbol prefix of $b$.

The function is defined recursively as:

$$
d_{a,b}(i,j) =
\begin{cases}
\max(i,j) & \text{if } \min(i,j) = 0, \\
\min
\begin{cases}
d_{a,b}(i-1,j) + 1 \\
d_{a,b}(i,j-1) + 1 \\
d_{a,b}(i-1,j-1) + 1_{(a_i \neq b_j)} \\
d_{a,b}(i-2,j-2) + 1
\end{cases}
& \text{if } i,j > 1 \text{ and } a_i = b_{j-1} \text{ and } a_{i-1} = b_j, \\
\min
\begin{cases}
d_{a,b}(i-1,j) + 1 \\
d_{a,b}(i,j-1) + 1 \\
d_{a,b}(i-1,j-1) + 1_{(a_i \neq b_j)}
\end{cases}
& \text{otherwise,}
\end{cases}
$$

where $1_{(a_i \neq b_j)}$ is the indicator function equal to 0 when $a_i = b_j$ and equal to 1 otherwise.

Each recursive call matches one of the cases covered by the Damerau–Levenshtein distance:
* $d_{a,b}(i-1,j) + 1$ corresponds to a deletion (from a to b).
* $d_{a,b}(i,j-1) + 1$ corresponds to an insertion (from a to b).
* $d_{a,b}(i-1,j-1) + 1_{(a_i \neq b_j)}$ corresponds to a match or mismatch, depending on whether the respective symbols are the same.
* $d_{a,b}(i-2,j-2) + 1$ corresponds to a transposition between two successive symbols.

The Damerau–Levenshtein distance between $a$ and $b$ is then given by the function value for full strings: $d_{a,b}(|a|,|b|)$, where $|a|$ denotes the length of string $a$ and $|b|$ is the length of $b$.

Source: Wikipedia, the free encyclopedia.
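
The recursive definition in the Definition section can be evaluated bottom-up with a table of prefix distances. The following is a minimal sketch, not a reference implementation. It assumes the four cases listed there (deletion, insertion, match or mismatch, and transposition of two adjacent symbols), and the function name `osa_distance` is chosen here for illustration. A recurrence of this form never edits a transposed pair a second time, so it is commonly called the restricted, or optimal string alignment, variant. For some inputs it gives a larger value than an unrestricted Damerau–Levenshtein computation: "ca" to "abc" costs 3 here, whereas transposing to "ac" and then inserting "b" takes only 2 operations.

```python
def osa_distance(a: str, b: str) -> int:
    """Evaluate the recursive definition bottom-up (restricted variant)."""
    # d[i][j] is the distance between the first i symbols of a
    # and the first j symbols of b.
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]

    # Base case min(i, j) = 0: the distance is max(i, j).
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j

    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            # Indicator: 0 if the i-th symbol of a equals the j-th of b.
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or mismatch
            )
            # Transposition of two adjacent symbols (needs i, j > 1).
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)

    # The distance for the full strings is the last table entry.
    return d[len(a)][len(b)]


print(osa_distance("ab", "ba"))   # one transposition
print(osa_distance("ca", "abc"))  # restricted variant
```

The table uses time and memory proportional to the product of the two string lengths. Only the last three rows are ever read, so a version that stores just those rows would need less memory at the cost of slightly more bookkeeping.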
|